HEALPix logo

UXarray for Basic HEALPix Statistics & Visualization

In this section, you’ll learn:

  • Utilizing intake to open a HEALPix data catalog

  • Using the uxarray package to look at basic statistics over HEALPix data

  • Using UXarray plotting functionality on HEALPix data

Prerequisites

Concepts

Importance

Notes

UXarray

Necessary

HEALPix overview

Necessary

Time to learn: 30 minutes


import uxarray as ux
import cartopy.crs as ccrs

Open data catalog

Tip

If you think you first need to learn about Intake, Pythia’s Intake Cookbook is a great resource to do so.:::

This time, let us use a different catalog than the previous section, easy.gems for HEALPix Analysis & Visualization, and open the online catalog from the WCRP’s Digital Earths Global Hackathon 2025 using intake and read the output of the nextGEMS Cycle 3 ICON run ngc3028 as a Xarray.Dataset:

Tip

Despite using a different data catalog, this section will still showcase similar operations with the previous section, e.g. basic statistics and global and regional data plotting, except at the end where further grid exploration methods will be demonstrated.:::

import intake

# Final data catalog location (once hackathon website (https://digital-earths-global-hackathon.github.io/) updated)
#cat_url='https://digital-earths-global-hackathon.github.io/catalog/catalog.yaml'
# Interim data catalog location
cat_url='https://raw.githubusercontent.com/digital-earths-global-hackathon/catalog/refs/heads/ncar/online/main.yaml'
cat = intake.open_catalog(cat_url)
model_run = cat.icon_ngc4008

ds_coarsest = model_run().to_dask()
ds_fine = model_run(zoom=7).to_dask()
/home/runner/miniconda3/envs/healpix-cookbook-dev/lib/python3.10/site-packages/intake_xarray/base.py:21: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
  'dims': dict(self._ds.dims),
/home/runner/miniconda3/envs/healpix-cookbook-dev/lib/python3.10/site-packages/intake_xarray/base.py:21: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
  'dims': dict(self._ds.dims),
uxds_coarsest = ux.UxDataset.from_healpix(ds_coarsest)

uxds_fine = ux.UxDataset.from_healpix(ds_fine)
uxds_fine
<xarray.UxDataset> Size: 14TB
Dimensions:                              (time: 10958, depth_half: 73,
                                          n_face: 196608, level_full: 90,
                                          crs: 1, depth_full: 72,
                                          soil_depth_water_level: 5,
                                          level_half: 91,
                                          soil_depth_energy_level: 5)
Coordinates:
  * crs                                  (crs) float32 4B nan
  * depth_full                           (depth_full) float32 288B 1.0 ... 5....
  * depth_half                           (depth_half) float32 292B 0.0 ... 5....
  * level_full                           (level_full) int32 360B 1 2 3 ... 89 90
  * level_half                           (level_half) int32 364B 1 2 3 ... 90 91
  * soil_depth_energy_level              (soil_depth_energy_level) float32 20B ...
  * soil_depth_water_level               (soil_depth_water_level) float32 20B ...
  * time                                 (time) datetime64[ns] 88kB 2020-01-0...
Dimensions without coordinates: n_face
Data variables: (12/103)
    A_tracer_v_to                        (time, depth_half, n_face) float32 629GB ...
    FrshFlux_IceSalt                     (time, n_face) float32 9GB ...
    FrshFlux_TotalIce                    (time, n_face) float32 9GB ...
    Qbot                                 (time, n_face) float32 9GB ...
    Qtop                                 (time, n_face) float32 9GB ...
    Wind_Speed_10m                       (time, n_face) float32 9GB ...
    ...                                   ...
    vas                                  (time, n_face) float32 9GB ...
    w                                    (time, depth_half, n_face) float32 629GB ...
    wa_phy                               (time, level_half, n_face) float32 784GB ...
    zg                                   (level_full, n_face) float32 71MB ...
    zghalf                               (level_half, n_face) float32 72MB ...
    zos                                  (time, n_face) float32 9GB ...

HEALPix basic stats using UXarray

import matplotlib.pylab as plt

boulder_lon = -105.2747
boulder_lat = 40.0190

Mesh face containing Boulder’s coords

%%time
boulder_face = uxds_fine.uxgrid.get_faces_containing_point(point_lonlat=[boulder_lon, boulder_lat])
CPU times: user 8.99 s, sys: 113 ms, total: 9.1 s
Wall time: 8.97 s

Global and Boulder’s temperature averages

In order to get a line plot of our UXarray.UxDataset objects’ 1-dimensional temperature variables, we will convert them to xarray and call the default plot function because UXarray’s default plot functions are all dedicated to grid-topology aware visualizations:

Data spans from 2020 to 2050, so let us also consider slicing it to have something like 2020 to mid-2023, which would also give us similar plot to that with easy.gems in the previous section

time_slice = slice("2020-01-02T00:00:00.000000000", "2023-07-31T00:00:00.000000000")
%%time
uxds_fine.tas.isel(n_face=boulder_face).sel(time=time_slice).to_xarray().plot(label="Boulder")
uxds_coarsest.tas.sel(time=time_slice).mean("n_face").to_xarray().plot(label="Global mean")

plt.legend()
CPU times: user 392 ms, sys: 172 ms, total: 564 ms
Wall time: 4.33 s
<matplotlib.legend.Legend at 0x7f4dc5ed2fe0>
../_images/cf12aa7935bfc01912dc0ad9d165bab96a9e1a6f8f87de5b30dd4531e3757e47.png

Data plotting with UXarray

UXarray provides several built-in plotting functions to visualize unstructured grids, which can also be applied to HEALPix grids in the same interface:

Let us first look into interactive plots with bokeh backend (i.e. UXarray’s plotting functions have a backend parameter that defaults to “bokeh”, and it can also accept “matplotlib”)

Global plots

Let us first plot the global temperature (at the first timestep for simplicity), using the default backend, bokeh, of UXarray’s visualization API to create an interactive plot:

%%time

projection = ccrs.Robinson(central_longitude=-135.5808361)

uxds_fine["tas"].isel(time=0).plot(
    projection=projection,
    cmap="inferno",
    features=["borders", "coastline"],
    title="Global temperature",
    width=700,
)
CPU times: user 7.75 s, sys: 81.9 ms, total: 7.84 s
Wall time: 9.04 s
WARNING:param.GeoOverlayPlot00477: Due to internal constraints, when aspect and width/height is set, the bokeh backend uses those values as frame_width/frame_height instead. This ensures the aspect is respected, but means that the plot might be slightly larger than anticipated. Set the frame_width/frame_height explicitly to suppress this warning.

Now, let us create the same plot, now using matplotlib as the backend:

%%time

uxds_fine["tas"].isel(time=0).plot(
    backend="matplotlib",
    projection=projection,
    cmap="inferno",
    features=["borders", "coastline"],
    title="Global temperature",
    width=1100,
)
CPU times: user 630 ms, sys: 34 ms, total: 664 ms
Wall time: 2.36 s

Regional subsets (Not only for plotting but also for analysis)

When a region on the globe is of interest, UXarray provides subsetting functions, which return new regional grids that can then be used in the same way a global grid is plotted.

Let us look into USA map using the Boulder, CO, USA coords we had used before for simplicity:

Subsetting uxds_fine into a new UxDataset using a “bounding box” around Boulder, CO first:

%%time

lon_bounds = (boulder_lon-20, boulder_lon+40)
lat_bounds = (boulder_lat-20, boulder_lat+12)

uxds_fine_subset = uxds_fine["tas"].isel(time=0).subset.bounding_box(
    lon_bounds, 
    lat_bounds
    )
/home/runner/miniconda3/envs/healpix-cookbook-dev/lib/python3.10/site-packages/uxarray/grid/grid.py:1432: RuntimeWarning: Necessary functions for computing the bounds of each face are not yet compiled with Numba. This initial execution will be significantly longer.
  warn(
CPU times: user 1min 7s, sys: 157 ms, total: 1min 8s
Wall time: 1min 5s

If we quick check the global and regional subset’s average temperature at the first timestep, we can see the difference:

print("Regional subset's temperature average: ", uxds_fine_subset.mean("n_face").values, " K")
print("Global temperature average: ", uxds_fine.tas.isel(time=0).mean("n_face").values, " K")
Regional subset's temperature average:  281.8399  K
Global temperature average:  286.30957  K

Now, let us plot the regional subset UxDataset:

%%time 
projection = ccrs.Robinson(central_longitude=boulder_lon)

uxds_fine_subset.plot(
    projection=projection,
    cmap="inferno",
    features=["borders", "coastline"],
    title="Boulder temperature",
    width=1100,
)
CPU times: user 67.9 ms, sys: 0 ns, total: 67.9 ms
Wall time: 67.5 ms

Grid topology exploration

Exploring the grid topology may be needed sometimes, and UXarray provides functionality to do so, both numerically and visually. Each UxDataset or UxDataArray has their associated Grid object that has all the information such as spherical and cartesian coordinates, connectivities, dimensions, etc. about the topology this data belongs to. This Grid object can be explored as follows:

uxds_fine.uxgrid
<uxarray.Grid>
Original Grid Type: HEALPix
Grid Dimensions:
  * n_node: 196610
  * n_face: 196608
  * n_max_face_nodes: 4
Grid Coordinates (Spherical):
  * node_lon: (196610,)
  * node_lat: (196610,)
  * face_lon: (196608,)
  * face_lat: (196608,)
Grid Coordinates (Cartesian):
  * node_x: (196610,)
  * node_y: (196610,)
  * node_z: (196610,)
Grid Connectivity Variables:
  * face_node_connectivity: (196608, 4)
Grid Descriptor Variables:
  * n_nodes_per_face: (196608,)

There might be times that the user wants to open a standalone Grid object for a HEALPix grid (or any other unstructured grids supported by UXarray) without accessing the data yet, which can also be achieved as follows:

# Let us open a pretty coarse HEALPix grid because we will visualize it later
uxgrid = ux.Grid.from_healpix(zoom=3, pixels_only=False)
uxgrid
<uxarray.Grid>
Original Grid Type: HEALPix
Grid Dimensions:
  * n_node: 770
  * n_face: 768
  * n_max_face_nodes: 4
Grid Coordinates (Spherical):
  * node_lon: (770,)
  * node_lat: (770,)
  * face_lon: (768,)
  * face_lat: (768,)
Grid Coordinates (Cartesian):
Grid Connectivity Variables:
  * face_node_connectivity: (768, 4)
Grid Descriptor Variables:
uxgrid.plot(
    periodic_elements="split",         
    projection=ccrs.Mollweide(),
    width=800,
    title="HEALPix, zoom=3")

What is next?

The next section will provide an UXarray workflow that loads in and analyzes & visualizes HEALPix data.